Abstract

We prove that every integer $n \geq 10$ such that $n \not\equiv 1 \bmod 4$ can be
written as the sum of the square of a prime and a square-free number.
This makes explicit a theorem of Erdős that every sufficiently large
integer of this type may be written in such a way. Our proof requires us
to construct new explicit results for primes in arithmetic progressions.
As such, we use the second author's numerical computation regarding
GRH to extend the explicit bounds of Ramaré-Rumely.


1 Introduction 

We say that a positive integer is square-free if it is not divisible by the square
of any prime number. It was proven by Erdős [7] in 1935 that every sufficiently
large integer $n \not\equiv 1 \bmod 4$ may be written as the sum of the square of a prime
and a square-free number. The congruence condition here is sensible. If $n \equiv 1
\bmod 4$ then $4 \mid (n - p^2)$ for any odd prime $p$. This only leaves the case $p = 2$,
but $n - 4$ fails to be square-free infinitely often.^1

1 For example, one can consider the congruence class $13 \bmod 36$.

It is the objective of this paper to make explicit the proof provided by
Erdős, to the end of proving the following theorem.

Theorem 1. Let $n \geq 10$ be an integer such that $n \not\equiv 1 \bmod 4$. Then there
exists a prime $p$ and a square-free number $k$ such that $n = p^2 + k$.
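
As an illustration of the statement (not part of the proof), the following sketch searches for a representation by brute force. The helper names `is_squarefree`, `is_prime` and `representation` are ours, and trial division is used only because it is short; it is far too slow for the ranges treated later in the paper.

```python
from math import isqrt

def is_squarefree(m):
    """True if no square d*d with d >= 2 divides the positive integer m."""
    if m < 1:
        return False
    d = 2
    while d * d <= m:
        if m % (d * d) == 0:
            return False
        d += 1
    return True

def is_prime(p):
    """Trial-division primality test."""
    return p >= 2 and all(p % d for d in range(2, isqrt(p) + 1))

def representation(n):
    """Return (p, k) with n = p*p + k, p prime and k square-free, or None."""
    for p in range(2, isqrt(n) + 1):
        if is_prime(p) and is_squarefree(n - p * p):
            return p, n - p * p
    return None

print(representation(10))   # (2, 6)
print(representation(13))   # None: 13 - 4 = 9 and 13 - 9 = 4 are both squares

# Every n = 13 + 36t has n - 4 divisible by 9, as in the footnote example.
print(all((13 + 36 * t - 4) % 9 == 0 for t in range(100)))   # True
```

The value 13 shows why the congruence condition matters: it is 1 mod 4, so only $p = 2$ could work, and $13 - 4 = 9$ is not square-free.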

In a recent paper [5], the first author proved that every integer greater than
two can be written as the sum of a prime and a square-free number. One can
think of such a result as a weak-but-explicit form of Goldbach's conjecture.
Theorem 1 is significantly stronger than this, for the sequence of squares of
primes is far more sparse than the sequence of the primes. To prove Theorem
1 we combine modern explicit results on primes in arithmetic progressions
and computation.

The proof may be outlined as follows. For any integer $n$ satisfying the
conditions of the above theorem, we want to show that there exists a prime
$p < \sqrt{n}$ such that $n - p^2$ is square-free. That is, we require some prime $p$ such
that

\[ n - p^2 \not\equiv 0 \bmod q^2 \]

for all odd primes $q < \sqrt{n}$. The idea is to consider, for some large $n$ and each
odd prime $q < \sqrt{n}$, those mischievous primes $p$ that satisfy the congruence

\[ n \equiv p^2 \bmod q^2. \]

Then, for each $q$ we explicitly bound from above (with logarithmic weights)
the number of primes $p$ which satisfy the above congruence. Summing over
all moduli $q$ gives us an upper bound for the weighted count of the so-called
mischievous primes

\[ \sum_{q < \sqrt{n}} \sum_{\substack{p < \sqrt{n} \\ n \equiv p^2 \bmod q^2}} \log p. \]

It is then straightforward to show that for large enough $n$, the above sum is
less than the weighted count of all primes less than $\sqrt{n}$, and therefore there
must exist a prime $p < \sqrt{n}$ such that $n - p^2$ is not divisible by the square of
any prime.

This method works well, and allows us to prove Theorem 1 for all integers
$n \geq 2.5 \cdot 10^{14}$ that satisfy the congruence condition. We eliminate the remaining
cases by direct computation to complete the proof.


2 Theorem 1 for large integers

2.1 Case 1 

We start by considering integers in the range $n \geq 2.5 \cdot 10^{14}$ such that $n \not\equiv
1 \bmod 4$. As usual, we define

\[ \theta(x; k, l) = \sum_{\substack{p \leq x \\ p \equiv l \bmod k}} \log p \]

where $p$ denotes a prime number.
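
To make the definition concrete, here is a minimal sketch that evaluates $\theta(x; k, l)$ directly and compares it with the expected main term $x/\varphi(k)$. The function names are ours, the choice $k = 9$, $x = 10^5$ is arbitrary, and the constant 1.109042 is the value of $\omega(3^2, 10^{10})$ quoted in Lemma 4 of this section.

```python
from math import isqrt, log, sqrt

def primes_up_to(limit):
    """All primes <= limit (sieve of Eratosthenes)."""
    flags = bytearray([1]) * (limit + 1)
    flags[0:2] = b"\x00\x00"
    for i in range(2, isqrt(limit) + 1):
        if flags[i]:
            flags[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, f in enumerate(flags) if f]

def theta(x, k, l):
    """Sum of log p over primes p <= x with p congruent to l mod k."""
    return sum(log(p) for p in primes_up_to(x) if p % k == l % k)

x, k, phi_k = 10**5, 9, 6            # phi(9) = 6
for l in (1, 2, 4, 5, 7, 8):         # residue classes coprime to 9
    deviation = theta(x, k, l) - x / phi_k
    # Compare the deviation with omega(3^2, 10^10) * sqrt(x).
    print(l, round(deviation, 2), round(1.109042 * sqrt(x), 2))
```

Each residue class carries roughly one sixth of the total weight, which is the behaviour the explicit bounds below quantify.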

The paper of Ramaré-Rumely [10] provides us with bounds of the form

\[ \left| \theta(x; k, l) - \frac{x}{\varphi(k)} \right| < \epsilon(k, x_0) \frac{x}{\varphi(k)} \]

and

\[ \left| \theta(x; k, l) - \frac{x}{\varphi(k)} \right| < \omega(k, x_1) \sqrt{x} \]

for various ranges of $x \geq x_0$ and $x \leq x_1$ respectively. These computations were
in turn based on Rumely's numerical verification of the Generalised Riemann
Hypothesis (GRH) [12] for various moduli and to certain heights. Since then,
the second author has verified GRH for a wider range of moduli and to greater
heights [9]. For our purposes, we rely only on the following:

Lemma 2. Let $q$ be a prime satisfying $17 \leq q \leq 97$. All non-trivial zeros $\rho$ of
Dirichlet L-functions derived from characters of modulus $q^2$ with $|\Im \rho| \leq 1000$
have $\Re \rho = 1/2$.

Proof. See Theorem 10.1 of [9]. □

We can therefore extend the results of Ramaré-Rumely with the following
lemma:

Lemma 3. For $x \geq 10^{10}$ we have

\[ \left| \theta(x; q^2, l) - \frac{x}{\varphi(q^2)} \right| < \epsilon(q^2, 10^{10}) \frac{x}{\varphi(q^2)} \]

for the values of $q$ and $\epsilon(q^2, 10^{10})$ in Table 1.

Proof. We refer to [10]. The values for $q \in \{3, 5, 7, 11, 13\}$ are from Table 1
of that paper. For the other entries, we use Theorem 5.1.1 with $H_\chi = 1000$
and $C_1(\chi, H_\chi) = 9.14$ (see display 4.2). We set $m = 10$ for $q \leq 23$, $m = 12$
for $q \geq 47$ and $m = 11$ otherwise. We use $\delta = 2e/H_\chi$ and for $A_\chi$ we use the
upper bound of Lemma 4.2.1. Finally, for $E_\chi$ we rely on Lemma 4.1.2 and we
note that $2 \cdot 9.645908801 \cdot \log^2(1000/9.14) \geq \log 10^{10}$ as required. □


Table 1: Values for $\epsilon(q^2, 10^{10})$.

  $q$   $\epsilon$        $q$   $\epsilon$       $q$   $\epsilon$       $q$   $\epsilon$
    3   0.003228       19   0.17641       43   0.95757       71   2.82639
    5   0.012214       23   0.25779       47   1.15923       73   3.00162
    7   0.017015       29   0.41474       53   1.50179       79   3.56158
   11   0.031939       31   0.47695       59   1.89334       83   3.96363
   13   0.042497       37   0.69397       61   2.03488       89   4.61023
   17   0.14271        41   0.86446       67   2.49293       97   5.55434


Lemma 4. We have

\begin{align*}
\omega(3^2, 10^{10}) &= 1.109042, \\
\omega(5^2, 10^{10}) &= 0.821891, \\
\omega(7^2, 10^{10}) &= 0.744132, \\
\omega(11^2, 10^{10}) &= 0.711433
\end{align*}

and

\[ \omega(13^2, 10^{10}) = 0.718525. \]

If $q$ is a prime such that $17 \leq q \leq 97$ we have

\[ \omega(q^2, 10^{10}) = \frac{\log 7 - 7/\varphi(q^2)}{\sqrt{7}}. \]


Proof. The results for $\{3^2, 5^2, 7^2, 11^2, 13^2\}$ are from Table 2 of [10] with a slight
correction to the entry for $5^2$. A short computation shows that the maximum
occurs for all of the other $q$ when $x = 7$ and the residue class is $a = 7$. □
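
The closed form in Lemma 4 is easy to evaluate. The sketch below does so for a few moduli; the function name `omega_q2` is ours. Evaluating the same expression at $q = 11$ and $q = 13$ also agrees with the quoted table values to about six decimal places, which serves as a sanity check on the formula (those two entries are taken from [10] in the lemma, not from this expression).

```python
from math import log, sqrt

def omega_q2(q):
    """(log 7 - 7/phi(q^2)) / sqrt(7), using phi(q^2) = q(q-1) for prime q."""
    return (log(7) - 7 / (q * (q - 1))) / sqrt(7)

for q in (11, 13, 17, 97):
    print(q, round(omega_q2(q), 6))
```

As $q$ grows the value increases towards $\log 7 / \sqrt{7}$, so all 19 moduli with $17 \leq q \leq 97$ have $\omega(q^2, 10^{10})$ strictly between 0.72 and 0.74.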


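Lemma 5. Let $T = \sqrt{2.5 \cdot 10^{14}}$. Then for $x \geq T$ and $q \leq 97$ an odd prime we
have

\[ \left| \theta(x; q^2, l) - \frac{x}{\varphi(q^2)} \right| < \epsilon(q^2, T) \frac{x}{\varphi(q^2)} \]

where the values of $\epsilon(q^2, T)$ are given in Table 2.

Proof. Using $\omega(q^2, 10^{10})$ we have

\[ \left| \theta(T; q^2, l) - \frac{T}{\varphi(q^2)} \right| < \omega(q^2, 10^{10}) \sqrt{T} \]

so for $x \in [T, 10^{10}]$ we have

\[ \left| \theta(x; q^2, l) - \frac{x}{\varphi(q^2)} \right| < \omega(q^2, 10^{10}) \sqrt{x} \leq \frac{\omega(q^2, 10^{10}) \varphi(q^2)}{\sqrt{T}} \cdot \frac{x}{\varphi(q^2)} \]

and so we can take

\[ \epsilon(q^2, T) = \max\left( \epsilon(q^2, 10^{10}), \frac{\omega(q^2, 10^{10}) \varphi(q^2)}{\sqrt{T}} \right). \]

□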


Table 2: Values for $\epsilon(q^2, T)$ for Lemma 5.

  $q$   $\epsilon$        $q$   $\epsilon$       $q$   $\epsilon$       $q$   $\epsilon$
    3   0.00323        19   0.17641       43   0.95757       71   2.82639
    5   0.01222        23   0.25779       47   1.15923       73   3.00162
    7   0.01702        29   0.41474       53   1.50179       79   3.56158
   11   0.03194        31   0.47695       59   1.89334       83   3.96363
   13   0.04250        37   0.69397       61   2.03488       89   4.61023
   17   0.14271        41   0.86446       67   2.49293       97   5.55434
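
The maximum in the proof of Lemma 5 can be evaluated directly. The sketch below is our own check rather than the authors' computation: it takes $\epsilon(q^2, 10^{10})$ from Table 1 and $\omega(q^2, 10^{10})$ from Lemma 4 and reports which of the two arguments of the maximum is larger. The dictionary and function names are ours.

```python
from math import log, sqrt

T = sqrt(2.5e14)

# epsilon(q^2, 10^10) as listed in Table 1
EPS_1E10 = {
    3: 0.003228, 5: 0.012214, 7: 0.017015, 11: 0.031939, 13: 0.042497,
    17: 0.14271, 19: 0.17641, 23: 0.25779, 29: 0.41474, 31: 0.47695,
    37: 0.69397, 41: 0.86446, 43: 0.95757, 47: 1.15923, 53: 1.50179,
    59: 1.89334, 61: 2.03488, 67: 2.49293, 71: 2.82639, 73: 3.00162,
    79: 3.56158, 83: 3.96363, 89: 4.61023, 97: 5.55434,
}
# omega(q^2, 10^10) for the small moduli, as listed in Lemma 4
OMEGA_SMALL = {3: 1.109042, 5: 0.821891, 7: 0.744132, 11: 0.711433, 13: 0.718525}

def omega(q):
    """omega(q^2, 10^10): tabulated for q <= 13, closed form for 17 <= q <= 97."""
    if q in OMEGA_SMALL:
        return OMEGA_SMALL[q]
    return (log(7) - 7 / (q * (q - 1))) / sqrt(7)

def eps_T(q):
    """max(epsilon(q^2, 10^10), omega(q^2, 10^10) * phi(q^2) / sqrt(T))."""
    phi = q * (q - 1)
    return max(EPS_1E10[q], omega(q) * phi / sqrt(T))

for q in sorted(EPS_1E10):
    second = omega(q) * q * (q - 1) / sqrt(T)
    print(q, EPS_1E10[q], round(second, 6), eps_T(q))
```

In this evaluation the first argument wins for every $q$, which is consistent with Table 2 repeating the entries of Table 1 (rounded up to five decimal places for $q \leq 13$).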


Let $n \geq 2.5 \cdot 10^{14}$ be such that $n \not\equiv 1 \bmod 4$ and consider the case where $q$
is an odd prime such that $q \leq 97$. We want to bound from above the number
of primes $p < \sqrt{n}$ satisfying

\[ n \equiv p^2 \bmod q^2. \tag{1} \]

Clearly, $p$ can belong to at most two arithmetic progressions modulo $q^2$.
Therefore, by Lemma 5 we can estimate the weighted count of such primes
as follows.

\[ \sum_{\substack{p < \sqrt{n} \\ n \equiv p^2 \bmod q^2}} \log p \leq \theta(\sqrt{n}; q^2, l) + \theta(\sqrt{n}; q^2, l') < \frac{2(1 + \epsilon(q^2, T))}{q(q - 1)} \sqrt{n} \]

where $l$ and $l'$ are the possible congruence classes for $p$ and $\epsilon(q^2, T)$ is given in
Table 2. Summing this over all 24 values of $q$ gives us the contribution

\[ \sum_{q \in \{3, \ldots, 97\}} \sum_{\substack{p < \sqrt{n} \\ n \equiv p^2 \bmod q^2}} \log p < 0.568 \sqrt{n}. \tag{2} \]
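
The constant in (2) can be reproduced by summing $2(1 + \epsilon(q^2, T))/(q(q-1))$ over the 24 odd primes up to 97. A minimal sketch, with the Table 2 values typed in by hand (the variable names are ours):

```python
# epsilon(q^2, T) as listed in Table 2
EPS_T = {
    3: 0.00323, 5: 0.01222, 7: 0.01702, 11: 0.03194, 13: 0.04250,
    17: 0.14271, 19: 0.17641, 23: 0.25779, 29: 0.41474, 31: 0.47695,
    37: 0.69397, 41: 0.86446, 43: 0.95757, 47: 1.15923, 53: 1.50179,
    59: 1.89334, 61: 2.03488, 67: 2.49293, 71: 2.82639, 73: 3.00162,
    79: 3.56158, 83: 3.96363, 89: 4.61023, 97: 5.55434,
}

# Coefficient of sqrt(n) in (2): two residue classes per modulus q^2.
case1 = sum(2 * (1 + eps) / (q * (q - 1)) for q, eps in EPS_T.items())
print(len(EPS_T), case1)
```

The sum comes out just below 0.568, with the modulus $q = 3$ alone contributing about 0.334.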


2.2 Case 2 

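We now consider the case where $97 < q \leq n^c$ and $c \in (0, 1/4)$ is to be chosen
later to achieve an optimal result. Montgomery and Vaughan's [8] explicit
version of the Brun-Titchmarsh Theorem gives us that

\[ \pi(x; k, l) < \frac{2x}{\varphi(k) \log(x/k)} \]

for all $x > k$, where $\pi(x; k, l)$ counts the primes $p \leq x$ with $p \equiv l \bmod k$.
Since each such prime $p \leq \sqrt{n}$ has $\log p \leq \frac{1}{2} \log n$, one trivially has that

\[ \theta(\sqrt{n}; q^2, l) < \frac{\sqrt{n} \log n}{q(q - 1) \log(\sqrt{n}/q^2)}. \]

As $q \leq n^c$, we have $\log(\sqrt{n}/q^2) \geq (\frac{1}{2} - 2c) \log n$, and with two residue
classes for each $q$ it follows that

\[ \sum_{97 < q \leq n^c} \sum_{\substack{p < \sqrt{n} \\ n \equiv p^2 \bmod q^2}} \log p < \frac{\sqrt{n}}{\frac{1}{4} - c} \sum_{97 < q \leq n^c} \frac{1}{q(q - 1)}. \tag{3} \]

We can bound the sum as follows:

\[ \sum_{97 < q \leq n^c} \frac{1}{q(q - 1)} < \sum_{97 < q < 1000001} \frac{1}{q(q - 1)} + \sum_{m > 1000001} \frac{1}{m(m - 1)} < \sum_{97 < q < 1000001} \frac{1}{q(q - 1)} + \frac{1}{1000000} < 0.00183. \]

Substituting this into (3) gives us that

\[ \sum_{97 < q \leq n^c} \sum_{\substack{p < \sqrt{n} \\ n \equiv p^2 \bmod q^2}} \log p < \frac{0.00183 \sqrt{n}}{\frac{1}{4} - c}. \tag{4} \]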
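
The numerical constant 0.00183 can be checked with a short computation. The sketch below sums $1/(q(q-1))$ over the primes $97 < q < 1000001$ and adds the tail bound $1/1000000$; the sieve helper `primes_below` is ours and is only one of many ways to list the primes.

```python
from math import isqrt

def primes_below(limit):
    """All primes p < limit (sieve of Eratosthenes)."""
    flags = bytearray([1]) * limit
    flags[0:2] = b"\x00\x00"
    for i in range(2, isqrt(limit - 1) + 1):
        if flags[i]:
            flags[i * i :: i] = bytearray(len(range(i * i, limit, i)))
    return [i for i, f in enumerate(flags) if f]

prime_part = sum(1 / (q * (q - 1)) for q in primes_below(1_000_001) if q > 97)
tail = 1 / 1_000_000        # bounds the sum of 1/(m(m-1)) over m > 1000001
total = prime_part + tail
print(prime_part, total)
```

The total falls only slightly below 0.00183, so the quoted constant is close to sharp for this splitting point.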


2.3 Case 3


Let $q$ be an odd prime such that $n^c < q < A\sqrt{n}$, where $A \in (0, 1)$ is to be chosen
later for optimisation. Since there are at most two possible residue classes
modulo $q^2$ for $p$, the number of primes $p < \sqrt{n}$ such that $n \equiv p^2 \bmod q^2$ is trivially
less than

\[ 2 \left( \frac{\sqrt{n}}{q^2} + 1 \right). \]

Clearly, including our logarithmic weights one has that

\[ \sum_{\substack{p < \sqrt{n} \\ n \equiv p^2 \bmod q^2}} \log p < \left( \frac{\sqrt{n}}{q^2} + 1 \right) \log n \]

and so

\[ \sum_{n^c < q < A\sqrt{n}} \sum_{\substack{p < \sqrt{n} \\ n \equiv p^2 \bmod q^2}} \log p < \sqrt{n} \log n \sum_{m > n^c} \frac{1}{m^2} + \pi(A\sqrt{n}) \log n \]

where $\pi(x)$ denotes the number of primes not exceeding $x$. The sum can be
estimated in a straightforward way by

\[ \sum_{m > n^c} \frac{1}{m^2} < \frac{1}{n^{2c}} + \int_{n^c}^{\infty} \frac{dt}{t^2} = n^{-2c} + n^{-c} \]

and Theorem 6.9 of Dusart [6] gives us that

\[ \pi(A\sqrt{n}) < \frac{A\sqrt{n}}{\log(A\sqrt{n})} \left( 1 + \frac{1.2762}{\log(A\sqrt{n})} \right). \]

Therefore, putting this all together we have

\[ \sum_{n^c < q < A\sqrt{n}} \sum_{\substack{p < \sqrt{n} \\ n \equiv p^2 \bmod q^2}} \log p < \sqrt{n} \left( n^{-2c} + n^{-c} \right) \log n + \frac{A\sqrt{n} \log n}{\log(A\sqrt{n})} \left( 1 + \frac{1.2762}{\log(A\sqrt{n})} \right). \tag{5} \]


2.4 Case 4 


Finally, we consider the range $A\sqrt{n} \leq q < \sqrt{n}$. If $n - p^2$ is divisible by $q^2$,
then

\[ n = p^2 + Bq^2 \tag{6} \]

for some positive integer $B < A^{-2}$. We will need some preliminary results
here. First, it is known by the theory of quadratic forms (see Davenport [4,
Ch. 6]) that the equation

\[ ax^2 + by^2 = n, \]

where $a$, $b$ and $n$ are given positive integers, has at most $w 2^{\omega(n)}$ proper
solutions, that is, solutions with $\gcd(x, y) = 1$. Note that $w$ denotes the number of
automorphs of the above form and $\omega(n)$ denotes the number of different prime
factors of $n$. The number of automorphs is directly related to the discriminant
of the form; specifically, $w = 4$ for the case $B = 1$ and $w = 2$ for $B > 1$.
Moreover, we are only interested in the case where $x$ and $y$ are both positive, and
so it follows that equation (6) has at most $w 2^{\omega(n) - 2}$ proper solutions. Finally,
noting that there will be at most 1 improper solution to (6), namely $p = q$, we
can bound the overall number of solutions to (6) by $w 2^{\omega(n) - 2} + 1$.

Furthermore, Theorem 11 of Robin [11] gives us the explicit bound

\[ \omega(n) \leq 1.3841 \frac{\log n}{\log \log n} \]

for all $n \geq 3$. Thus, for fixed $n$ and $B$, it is easy to bound explicitly from above
the number of solutions to (6). It remains to sum this bound over all valid
values of $B$. However, we should note that given an integer $n$, there are not
too many good choices of $B$, and this will allow us to make a further saving.
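
As a small numerical illustration of the bound on the number of distinct prime factors quoted from Robin, the following sketch tabulates $\omega(n)$ for $n \leq 10^5$ with a sieve and compares it with $1.3841 \log n / \log\log n$. The limit and the names are our choices; this is a spot check over a small range, not a proof.

```python
from math import log

LIMIT = 100_000

# omega_count[n] = number of distinct prime factors of n
omega_count = [0] * (LIMIT + 1)
for p in range(2, LIMIT + 1):
    if omega_count[p] == 0:                      # no smaller prime divides p, so p is prime
        for multiple in range(p, LIMIT + 1, p):
            omega_count[multiple] += 1

def robin_bound(n):
    """Right-hand side of the bound, valid for n >= 3."""
    return 1.3841 * log(n) / log(log(n))

holds = all(omega_count[n] <= robin_bound(n) for n in range(3, LIMIT + 1))
print(holds, omega_count[30030], round(robin_bound(30030), 3))
```

Products of the first few primes, such as $30030 = 2 \cdot 3 \cdot 5 \cdot 7 \cdot 11 \cdot 13$, are the inputs that come closest to the bound.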

This comes from the observation that every prime $p > 3$ satisfies $p^2 \equiv
1 \bmod 24$. For with $p > 3$ and $q > 3$, Equation (6) becomes


\[ B \equiv n - 1 \bmod 24, \]


and this confines $B$ to the integers in a single residue class modulo 24.
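
The congruence used here is easy to confirm numerically. The sketch below checks $p^2 \equiv 1 \bmod 24$ for all primes $3 < p < 2000$ and then verifies, for a handful of arbitrarily chosen triples $(p, q, B)$, that $n = p^2 + Bq^2$ forces $B \equiv n - 1 \bmod 24$. The helper name and the sample values are ours.

```python
from math import isqrt

def is_prime(m):
    """Trial-division primality test."""
    return m >= 2 and all(m % d for d in range(2, isqrt(m) + 1))

primes = [p for p in range(5, 2000) if is_prime(p)]
squares_ok = all(p * p % 24 == 1 for p in primes)

# For p, q > 3: n = p^2 + B q^2 is congruent to 1 + B mod 24.
residue_ok = all(
    (B - (p * p + B * q * q - 1)) % 24 == 0
    for p in primes[:20]
    for q in primes[:20]
    for B in range(1, 50)
)
print(squares_ok, residue_ok)
```

The reason is that a prime $p > 3$ is odd and not divisible by 3, so $p^2 \equiv 1$ modulo both 8 and 3.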

Formally and explicitly, we argue as follows. Consider first the case where
$B$ is an integer in the range

\[ \frac{n - 9}{A^2 n} \leq B < \frac{1}{A^2}. \]

The leftmost inequality above keeps $p \leq 3$, since $q \geq A\sqrt{n}$ then gives
$Bq^2 \geq n - 9$. Here, there are clearly at most

\[ \frac{9}{A^2 n} + 1 \]

integer values for $B$. We now consider the case where $p > 3$, and it follows
that $B \equiv n - 1 \bmod 24$. Clearly, then, there are at most

\[ \frac{1}{24 A^2} + 1 \]

values for $B$ in this range. Therefore, in total, there are at most

\[ 2 + \frac{1}{24 A^2} + \frac{9}{A^2 n} \]

values of $B$ for which we need to sum the solution counts to Equation (6).
We must also consider that $w = 4$ for $B = 1$. Therefore, we have that
the number of solutions to Equation (6) summed over $B$ is bounded above by

\[ 2^{\omega(n) - 1} \left( 3 + \frac{1}{24 A^2} + \frac{9}{A^2 n} \right). \]

Therefore, the number of primes $p$ (including weights) which satisfy (6) is at
most

\[ \sum_{A\sqrt{n} \leq q < \sqrt{n}} \sum_{\substack{p < \sqrt{n} \\ n \equiv p^2 \bmod q^2}} \log p < 2^{1.3841 \log n / \log \log n} \left( \frac{3}{2} + \frac{1}{48 A^2} + \frac{9}{2 A^2 n} \right) \log n. \tag{7} \]

2.5 Collecting terms 

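Now, collecting together (2), (4), (5) and (7), we have that the weighted count
over all the so-called mischievous primes can be bounded thus

\begin{align*}
\sum_{q < \sqrt{n}} \sum_{\substack{p < \sqrt{n} \\ n \equiv p^2 \bmod q^2}} \log p <{}& \left( 0.568 + \frac{0.00183}{\frac{1}{4} - c} + \left( n^{-2c} + n^{-c} \right) \log n \right) \sqrt{n} \\
& + \frac{A\sqrt{n} \log n}{\log(A\sqrt{n})} \left( 1 + \frac{1.2762}{\log(A\sqrt{n})} \right) \\
& + 2^{1.3841 \log n / \log \log n} \left( \frac{3}{2} + \frac{1}{48 A^2} + \frac{9}{2 A^2 n} \right) \log n.
\end{align*}

As expected, however, the weighted count over all primes exceeds this for large
enough $n$ and good choices of $c$ and $A$. Dusart [6] gives us that

\[ \theta(x) > x - 0.2 \frac{x}{\log^2 x} \]

for all $x \geq 3594641$, and thus it follows that

\[ \theta(\sqrt{n}) > \sqrt{n} - 0.8 \frac{\sqrt{n}}{\log^2 n} \]

for all $n \geq 10^{14}$. Therefore, if we denote by $R(n)$ the (weighted) count of
primes $p$ such that $n - p^2$ is square-free, it follows that

\begin{align*}
R(n) >{}& \left( 1 - 0.568 - \frac{0.00183}{\frac{1}{4} - c} - \frac{0.8}{\log^2 n} - \left( n^{-2c} + n^{-c} \right) \log n \right) \sqrt{n} \\
& - \frac{A\sqrt{n} \log n}{\log(A\sqrt{n})} \left( 1 + \frac{1.2762}{\log(A\sqrt{n})} \right) \\
& - 2^{1.3841 \log n / \log \log n} \left( \frac{3}{2} + \frac{1}{48 A^2} + \frac{9}{2 A^2 n} \right) \log n.
\end{align*}

It is now straightforward to check that choosing $c = 0.209$ and $A = 0.0685$
gives $R(n) > 0$ for all $n \geq 2.5 \times 10^{14}$.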
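
The final check can be carried out numerically. The sketch below evaluates the lower bound for $R(n)$ with $c = 0.209$ and $A = 0.0685$ at a few sample values of $n$. The function name is ours, and the numerator 9 in the last bracket follows our reading of the Case 4 count; that term is negligible for $n$ of this size. Evaluating at sample points illustrates the claim but does not by itself establish it for all $n \geq 2.5 \times 10^{14}$.

```python
from math import log, sqrt

def r_lower_bound(n, c=0.209, A=0.0685):
    """Lower bound for R(n) assembled from Cases 1 to 4 and the bound on theta."""
    L = log(n)
    root = sqrt(n)
    # Cases 1 and 2, the loss in theta(sqrt(n)), and the first part of Case 3
    main = (1 - 0.568 - 0.00183 / (0.25 - c) - 0.8 / L**2
            - (n ** (-2 * c) + n ** (-c)) * L) * root
    # Second part of Case 3: pi(A sqrt(n)) * log n
    x = A * root
    case3 = x * L / log(x) * (1 + 1.2762 / log(x))
    # Case 4: quadratic-form count with the bound on omega(n)
    case4 = 2 ** (1.3841 * L / log(L)) * (1.5 + 1 / (48 * A * A) + 9 / (2 * A * A * n)) * L
    return main - case3 - case4

for n in (2.5e14, 1e15, 1e20, 1e30):
    print(f"{n:.1e}", r_lower_bound(n) > 0, f"{r_lower_bound(n) / sqrt(n):.4f}")
```

The last column shows the bound as a multiple of $\sqrt{n}$, which makes it easier to see how much room the chosen parameters leave.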

3 Numerical Verification for “Small” n 

We now describe a computation undertaken to confirm that all $n \not\equiv 1 \bmod 4$,
$10 \leq n \leq$ 4 000 023 301 851 135 can be written as the sum of a prime squared
and a square-free number.^2 We will first describe the algorithm used, and then
say a few words about its implementation.

3.1 The Algorithm 

We aim to test $3 \cdot 10^{15}$ different $n$. We quickly conclude that we cannot afford
to individually test candidate $n - p^2$ to see if they are square-free. There is
an analytic algorithm [3] that is conjectured to be able to test a number of
size $n$ in time $\mathcal{O}(\exp([\log n]^{2/3 + o(1)}))$ but this is contingent on the Generalised
Riemann Hypothesis. We would be left needing to factor each $n - p^2$, which
would be prohibitively expensive.

We proceed instead by choosing a largest prime $P$ and a sieve width $W$.
To check all the integers in $[N, N + W)$ we first sieve all the integers in $[N -
P^2, N + W - 4)$ by crossing out any that are divisible by a prime square $p^2$
with $p < \sqrt{(N + W - 5)/2}$. Now for each $n \in [N, N + W)$, $n \not\equiv 1 \bmod 4$ we
look up in our sieve to see if $n - 4$ is square-free.^3 If not, we try $n - 9$ then
$n - 25$ and so on until $n - p^2$ is square-free. If it fails all these tests up to and
including $n - P^2$, we output $n$ for later checking.
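
The sieve-and-lookup step described above can be sketched in a few lines. This is an illustrative Python version under our own assumptions, not the authors' C++ code: the function names are ours, the window is tiny, and we sieve with every prime $r$ such that $r^2$ lies below the top of the sieved interval, which is a simpler (and slightly more conservative) limit than the one stated in the text. A naive trial-division test is included only to cross-check the sieve.

```python
from math import isqrt

def primes_up_to(limit):
    """All primes <= limit (sieve of Eratosthenes)."""
    if limit < 2:
        return []
    flags = bytearray([1]) * (limit + 1)
    flags[0:2] = b"\x00\x00"
    for i in range(2, isqrt(limit) + 1):
        if flags[i]:
            flags[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, f in enumerate(flags) if f]

def is_squarefree_naive(m):
    """Trial-division check, used here only to cross-check the sieve."""
    return m >= 1 and all(m % (d * d) for d in range(2, isqrt(m) + 1))

def window_failures(N, W, P):
    """n in [N, N+W), n % 4 != 1, with n - p*p not square-free for every prime p <= P."""
    lo, hi = N - P * P, N + W - 4          # the sieve covers [lo, hi)
    if lo < 1:
        raise ValueError("need N > P*P so that every n - p*p is positive")
    squarefree = bytearray([1]) * (hi - lo)
    for r in primes_up_to(isqrt(hi - 1)):
        sq = r * r
        start = -(-lo // sq) * sq           # first multiple of r*r that is >= lo
        squarefree[start - lo :: sq] = bytearray(len(range(start - lo, hi - lo, sq)))
    test_primes = primes_up_to(P)
    failures = []
    for n in range(N, N + W):
        if n % 4 == 1:
            continue
        if not any(squarefree[n - p * p - lo] for p in test_primes):
            failures.append(n)              # kept for later checking with larger p
    return failures

# With only p = 2 and p = 3 available, some n fail, for example 108:
# 108 - 4 = 104 is divisible by 4 and 108 - 9 = 99 is divisible by 9.
print(window_failures(100, 300, 3)[:5])
```

One byte per integer mirrors the character-array layout mentioned in the implementation notes; a production version would process much wider windows and run several of them in parallel.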

2 This is a factor of 16 further than we actually needed to check, but we did not expect 
our analytic approach to fare as well as it did. 

3 Unless $n \equiv 0 \bmod 4$.


3.2 The Implementation

Numbers of this size fit comfortably in the 64-bit native word size of modern
CPUs and we implemented the algorithm in C++. We use a character array
for the sieve^4, and chose a sieve width $W = 2^{31}$ as this allows us to run 16
such sieves in parallel in the memory available. We set the prime limit $P = 43$
as this was found to reduce the number of failures to a manageable level (see
below). To generate the primes used to sieve the character array we used Kim
Walisch's PrimeSieve [13].

We were able to run 16 threads on a node of the University of Bristol's
Bluecrystal Phase III cluster [1] and in total we required 5 400 core hours of
CPU time to check all $n \in$ [2048, 4 000 023 301 851 135]. In all, 4 915 values of
$n$ were rejected as none of $n - p^2$ with $p \leq 43$ were square-free. We checked
these 4 915 cases in seconds using PARI [2] and found that $p = 47$ eliminated
4 290 of them, 53 does for a further 538, 59 for 14 more, 61 for 61 (!), 67 doesn't
help (!), 71 kills off 11 more and the last one standing, $n =$ 1 623 364 493 706 484,
falls away with $p = 73$. Finally, we use PARI again to check $n \in [10, 2047]$ with
$n \not\equiv 1 \bmod 4$ and we are done.

It is interesting to consider the efficiency of the main part of this algorithm.
The CPUs on the compute nodes of Phase III are 2.6GHz Intel® Xeon® processors
and we checked $3 \cdot 10^{15}$ individual $n$ in 5 400 hours. This averages less
than 17 clock ticks per $n$ which suggests that the implementation must have
made good use of cache.
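
The bookkeeping in the last two paragraphs can be redone in a few lines. The figures below are those quoted in the text; the variable names are ours.

```python
# Rejected n eliminated by each further prime p, as reported in the text
eliminated = {47: 4290, 53: 538, 59: 14, 61: 61, 67: 0, 71: 11, 73: 1}
total_rejected = sum(eliminated.values())

core_hours = 5400
clock_hz = 2.6e9          # 2.6 GHz
n_checked = 3e15
ticks_per_n = core_hours * 3600 * clock_hz / n_checked

print(total_rejected)          # 4915
print(round(ticks_per_n, 2))   # 16.85
```

The counts add up to the 4 915 rejected cases, and the average cost is a little under 17 clock ticks per value of $n$.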


4 We considered using each byte to represent 8 or more n but the cost of the necessary 
bit twiddling proved too heavy. 